Exact Asymptotic Results for a Model of Sequence Alignment

نویسندگان

  • Satya N. Majumdar
  • Sergei Nechaev
چکیده

Finding analytically the statistics of the longest common subsequence (LCS) of a pair of random sequences drawn from c alphabets is a challenging problem in computational evolutionary biology. We present exact asymptotic results for the distribution of the LCS in a simpler, yet nontrivial, variant of the original model called the Bernoulli matching (BM) model which reduces to the original model in the c → ∞ limit. We show that in the BM model, for all c, the distribution of the asymptotic length of the LCS, suitably scaled, is identical to the Tracy-Widom distribution of the largest eigenvalue of a random matrix whose entries are drawn from a Gaussian unitary ensemble. In particular, in the c → ∞ limit, this provides an exact expression for the asymptotic length distribution in the original LCS problem.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An Application of the ABS LX Algorithm to Multiple Sequence Alignment

We present an application of ABS algorithms for multiple sequence alignment (MSA). The Markov decision process (MDP) based model leads to a linear programming problem (LPP), whose solution is linked to a suggested alignment. The important features of our work include the facility of alignment of multiple sequences simultaneously and no limit for the length of the sequences. Our goal here is to ...

متن کامل

Superlinearly convergent exact penalty projected structured Hessian updating schemes for constrained nonlinear least squares: asymptotic analysis

We present a structured algorithm for solving constrained nonlinear least squares problems, and establish its local two-step Q-superlinear convergence. The approach is based on an adaptive structured scheme due to Mahdavi-Amiri and Bartels of the exact penalty method of Coleman and Conn for nonlinearly constrained optimization problems. The structured adaptation also makes use of the ideas of N...

متن کامل

A generalization of Profile Hidden Markov Model (PHMM) using one-by-one dependency between sequences

The Profile Hidden Markov Model (PHMM) can be poor at capturing dependency between observations because of the statistical assumptions it makes. To overcome this limitation, the dependency between residues in a multiple sequence alignment (MSA) which is the representative of a PHMM can be combined with the PHMM. Based on the fact that sequences appearing in the final MSA are written based on th...

متن کامل

Asymptotic Analysis of Binary Gas Mixture Separation by Nanometric Tubular Ceramic Membranes: Cocurrent and Countercurrent Flow Patterns

Analytical gas-permeation models for predicting the separation process across  membranes (exit compositions and area requirement) constitutes an important and necessary step in understanding the overall performance of  membrane modules. But, the exact (numerical) solution methods suffer from the complexity of the solution. Therefore, solutions of nonlinear ordinary differential equations th...

متن کامل

Asymptotic Behavior of Weighted Sums of Weakly Negative Dependent Random Variables

Let be a sequence of weakly negative dependent (denoted by, WND) random variables with common distribution function F and let be other sequence of positive random variables independent of and for some and for all . In this paper, we study the asymptotic behavior of the tail probabilities of the maximum, weighted sums, randomly weighted sums and randomly indexed weighted sums of heavy...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004